The allocation problem for multivariate stratified random sampling is considered as a problem of
stochastic matrix integer mathematical programming. To this end,
the asymptotic normality of the sample covariance matrix of each stratum is established.
Some alternative approaches are suggested for its solution. An example is solved by
applying the proposed techniques.

1 INTRODUCTION

Not long ago, multivariate analysis was mainly based on linear methods illustrated on
small to medium-sized data sets. However, many novel developments have permitted the
introduction of several innovative statistical and mathematical tools for high-dimensional
data analysis. Developments such as generalised multivariate analysis, latent variable analysis,
DNA microarray data, pattern recognition, multivariate nonlinear analysis, data mining,
manifold learning, shape theory, etc., have given a new and modern image to Multivariate
Analysis.

One of the topics of statistical theory that is most commonly used in many fields of
scientific research is the theory of probabilistic sampling. From a multivariate point of
view, diverse authors have studied the problem of optimum allocation in multivariate
stratified random sampling. Arthanari and Dodge (1981) and Sukhatme et al. (1984), among
many others, posed the problem of optimum allocation in multivariate stratified random
sampling as a deterministic multiobjective mathematical programming problem, by
considering as objective function a cost function subject to restrictions on certain functions of
variances or vice versa, i.e., considering the functions of variances as objective and subject
to restrictions on costs. Note that, when the cost function is taken as
the objective function, the problem of optimum allocation in multivariate stratified
random sampling reduces to a classical uniobjective mathematical programming problem.
Furthermore, Diaz-Garcia and Ulloa (2008) propose the optimum allocation in multivariate
stratified random sampling as a deterministic nonlinear problem of matrix integer
mathematical programming constrained by a cost function or by a given sample size. Also,
Prekopa (1978) and Diaz-Garcia and Garay (2007) observe that the values of the
population variances are in fact random variables and formulate the corresponding problem of
optimum allocation in multivariate stratified random sampling as a stochastic mathematical
programming problem.

In this paper, the optimum allocation in multivariate stratified random sampling is 
posed as a stochastic matrix integer mathematical programming problem constrained by a 
cost function or by a given sample size. Section 2 provides notation and definitions on mul- 
tivariate stratified random sampling. Section 3 studies in detail the asymptotic normality 
of the sample mean vectors and covariance matrices. The optimum allocation in multivari- 
ate stratified random sampling via stochastic matrix integer mathematical programming is 
given in Section 4. Also, several particular solutions are derived for solving the proposed
stochastic mathematical programming problems. Finally, an example from the literature is
given in Section 5.



2 PRELIMINARY RESULTS ON MULTIVARIATE STRATIFIED RANDOM SAMPLING

Consider a population of size $N$, divided into $H$ sub-populations (strata). We wish to
find a representative sample of size n and an optimum allocation in the strata meeting 
the following requirements: i) to minimise the variance of the estimated mean subject to a 
budgetary constraint; or ii) to minimise the cost subject to a constraint on the variances; 
this is the classical problem in optimum allocation in univariate stratified sampling, see 
Cochran (1977), Sukhatme et al. (1984) and Thompson (1997). However, if more than 
one characteristic (variable) is being considered then the problem is known as optimum 
allocation in multivariate stratified sampling. For a formal expression of the problem of 
optimum allocation in stratified sampling, consider the following notation. 

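The subindex $h = 1, 2, \dots, H$ denotes the stratum, $i = 1, 2, \dots, N_h$ or $n_h$ the unit
within stratum $h$ and $j = 1, 2, \dots, G$ denotes the characteristic (variable). Moreover:

$N_h$: Total number of units within stratum $h$.

$n_h$: Number of units from the sample in stratum $h$.

$Y_h = (Y_h^1, \dots, Y_h^G) = (Y_{h1}, \dots, Y_{hN_h})'$: $N_h \times G$ population matrix in stratum $h$; $Y_{hi}$ is the $G$-dimensional value of the $i$-th unit in stratum $h$.

$y_h = (y_h^1, \dots, y_h^G) = (y_{h1}, \dots, y_{hn_h})'$: $n_h \times G$ sample matrix in stratum $h$; $y_{hi}$ is the $G$-dimensional value of the $i$-th unit of the sample in stratum $h$.

$y_{hi}^j$: Value obtained for the $i$-th unit in stratum $h$ of the $j$-th characteristic.

$n = (n_1, \dots, n_H)'$: Vector of the number of units in the sample.

$W_h = N_h / N$: Relative size of stratum $h$.

$\bar Y_h^j = \frac{1}{N_h} \sum_{i=1}^{N_h} y_{hi}^j$: Population mean in stratum $h$ of the $j$-th characteristic.

$\bar Y_h = (\bar Y_h^1, \dots, \bar Y_h^G)'$: Population mean vector in stratum $h$.

$\bar y_h^j = \frac{1}{n_h} \sum_{i=1}^{n_h} y_{hi}^j$: Sample mean in stratum $h$ of the $j$-th characteristic.

$\bar y_h = (\bar y_h^1, \dots, \bar y_h^G)'$: Sample mean vector in stratum $h$.

$\bar y_{ST}^j = \sum_{h=1}^{H} W_h \bar y_h^j$: Estimator of the population mean in multivariate stratified sampling for the $j$-th characteristic.

$\bar y_{ST} = (\bar y_{ST}^1, \dots, \bar y_{ST}^G)'$: Estimator of the population mean vector in multivariate stratified sampling.

$S_h$: Covariance matrix in stratum $h$,

$$S_h = \frac{1}{N_h} \sum_{i=1}^{N_h} (y_{hi} - \bar Y_h)(y_{hi} - \bar Y_h)',$$

where $S_{h_{jk}}$ is the covariance in stratum $h$ of the $j$-th and $k$-th characteristics; furthermore

$$S_{h_{jk}} = \frac{1}{N_h} \sum_{i=1}^{N_h} (y_{hi}^j - \bar Y_h^j)(y_{hi}^k - \bar Y_h^k), \quad \text{and} \quad S_{h_{jj}} \equiv S_{hj}^2 = \frac{1}{N_h} \sum_{i=1}^{N_h} (y_{hi}^j - \bar Y_h^j)^2.$$

$s_h$: Estimator of the covariance matrix in stratum $h$,

$$s_h = \frac{1}{n_h - 1} \sum_{i=1}^{n_h} (y_{hi} - \bar y_h)(y_{hi} - \bar y_h)',$$

where $s_{h_{jk}}$ is the sample covariance in stratum $h$ of the $j$-th and $k$-th characteristics; furthermore

$$s_{h_{jk}} = \frac{1}{n_h - 1} \sum_{i=1}^{n_h} (y_{hi}^j - \bar y_h^j)(y_{hi}^k - \bar y_h^k), \quad \text{and} \quad s_{h_{jj}} \equiv s_{hj}^2 = \frac{1}{n_h - 1} \sum_{i=1}^{n_h} (y_{hi}^j - \bar y_h^j)^2.$$

$\mathrm{Cov}(\bar y_{ST})$: Covariance matrix of $\bar y_{ST}$.

$\widehat{\mathrm{Cov}}(\bar y_{ST})$: Estimator of the covariance matrix of $\bar y_{ST}$, defined as

$$\widehat{\mathrm{Cov}}(\bar y_{ST}) =
\begin{pmatrix}
\widehat{\mathrm{Var}}(\bar y_{ST}^1) & \widehat{\mathrm{Cov}}(\bar y_{ST}^1, \bar y_{ST}^2) & \cdots & \widehat{\mathrm{Cov}}(\bar y_{ST}^1, \bar y_{ST}^G) \\
\widehat{\mathrm{Cov}}(\bar y_{ST}^2, \bar y_{ST}^1) & \widehat{\mathrm{Var}}(\bar y_{ST}^2) & \cdots & \widehat{\mathrm{Cov}}(\bar y_{ST}^2, \bar y_{ST}^G) \\
\vdots & \vdots & \ddots & \vdots \\
\widehat{\mathrm{Cov}}(\bar y_{ST}^G, \bar y_{ST}^1) & \widehat{\mathrm{Cov}}(\bar y_{ST}^G, \bar y_{ST}^2) & \cdots & \widehat{\mathrm{Var}}(\bar y_{ST}^G)
\end{pmatrix}
= \sum_{h=1}^{H} \frac{W_h^2 s_h}{n_h} - \sum_{h=1}^{H} \frac{W_h s_h}{N}.$$

$\widehat{\mathrm{Cov}}(\bar y_{ST}^j, \bar y_{ST}^k)$: Estimated covariance of $\bar y_{ST}^j$ and $\bar y_{ST}^k$, with

$$\widehat{\mathrm{Cov}}(\bar y_{ST}^j, \bar y_{ST}^k) = \sum_{h=1}^{H} \frac{W_h^2 s_{h_{jk}}}{n_h} - \sum_{h=1}^{H} \frac{W_h s_{h_{jk}}}{N}, \quad \text{and} \quad
\widehat{\mathrm{Var}}(\bar y_{ST}^j) = \sum_{h=1}^{H} \frac{W_h^2 s_{hj}^2}{n_h} - \sum_{h=1}^{H} \frac{W_h s_{hj}^2}{N}.$$

$c_h$: Cost per $G$-dimensional sampling unit in stratum $h$, and let $c = (c_1, \dots, c_H)'$.

Here, if $a \in \mathbb{R}^G$, $a'$ denotes the transpose of $a$.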



3 LIMITING DISTRIBUTION OF SAMPLE MEANS AND COVARIANCE 
MATRICES 

In this section the asymptotic distribution of the estimator of the covariance matrix $s_h$ and
of the mean $\bar y_h$ is considered. With this aim in mind, the multivariate version of Hajek's theorem
is proposed in the context of sampling theory in terms of the extension stated in Hajek
(1961). First, consider the following notation and definitions.

A detailed discussion of the operators "vec" and "vech", the Moore-Penrose inverse, the Kronecker
product, the commutation matrix and the duplication matrix may be found in Magnus and Neudecker
(1988), among many others. For convenience, some notation is introduced below, although
in general it adheres to standard usage.

For every matrix $A$, there exists a unique matrix $A^{+}$ which is termed the Moore-Penrose
inverse of $A$.

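Let $A$ be an $m \times n$ matrix and $B$ a $p \times q$ matrix. The $mp \times nq$ matrix defined by

$$\begin{pmatrix}
a_{11}B & \cdots & a_{1n}B \\
\vdots & & \vdots \\
a_{m1}B & \cdots & a_{mn}B
\end{pmatrix}$$

is termed the Kronecker product (also termed tensor product or direct product) of $A$ and
$B$ and written $A \otimes B$. Let $C$ be an $m \times n$ matrix and $C_j$ its $j$-th column; then $\operatorname{vec} C$ is the
$mn \times 1$ vector

$$\operatorname{vec} C = \begin{pmatrix} C_1 \\ C_2 \\ \vdots \\ C_n \end{pmatrix}.$$

The vectors $\operatorname{vec} C$ and $\operatorname{vec} C'$ clearly contain the same $mn$ components, but in a different order.
Therefore there exists a unique $mn \times mn$ permutation matrix which transforms $\operatorname{vec} C$ into
$\operatorname{vec} C'$. This matrix is termed the commutation matrix and is denoted $K_{mn}$. (If $m = n$, it is
often written $K_n$ instead of $K_{nn}$.) Hence

$$K_{mn} \operatorname{vec} C = \operatorname{vec} C'.$$

Similarly, let $B$ be a square $n \times n$ matrix. Then $\operatorname{vech} B$ (also denoted as $v(B)$) shall denote
the $n(n+1)/2 \times 1$ vector that is obtained from $\operatorname{vec} B$ by eliminating all supradiagonal
elements of $B$. If $B = B'$, $\operatorname{vech} B$ contains only the distinct elements of $B$, and there is
a unique $n^2 \times n(n+1)/2$ matrix termed the duplication matrix, which is denoted by $D_n$, such
that $D_n \operatorname{vech} B = \operatorname{vec} B$ and $D_n^{+} \operatorname{vec} B = \operatorname{vech} B$. Finally, denote $(\operatorname{vech} B)' = \operatorname{vech}' B$.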
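The operators above can be checked numerically. The following is a minimal NumPy sketch, not part of the original paper: the helper names `vec`, `vech`, `commutation_matrix` and `duplication_matrix` are illustrative choices, and the test matrices are arbitrary.

```python
import numpy as np

def vec(C):
    """Stack the columns of C into one long vector."""
    return np.asarray(C).reshape(-1, order="F")

def vech(B):
    """Stack the columns of the lower triangle of B (diagonal included)."""
    B = np.asarray(B)
    n = B.shape[0]
    return np.concatenate([B[j:, j] for j in range(n)])

def commutation_matrix(m, n):
    """K_mn such that K_mn @ vec(C) == vec(C.T) for any m x n matrix C."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # C[i, j] sits at position j*m + i in vec(C)
            # and at position i*n + j in vec(C.T).
            K[i * n + j, j * m + i] = 1.0
    return K

def duplication_matrix(n):
    """D_n such that D_n @ vech(B) == vec(B) for any symmetric n x n matrix B."""
    D = np.zeros((n * n, n * (n + 1) // 2))
    col = 0
    for j in range(n):
        for i in range(j, n):
            D[j * n + i, col] = 1.0  # entry (i, j) of B inside vec(B)
            D[i * n + j, col] = 1.0  # entry (j, i) of B inside vec(B)
            col += 1
    return D

C = np.arange(6.0).reshape(2, 3)
B = np.array([[2.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 4.0]])
a = np.array([1.0, -2.0, 0.5])

K = commutation_matrix(2, 3)
D = duplication_matrix(3)
D_plus = np.linalg.pinv(D)  # Moore-Penrose inverse of D_3

checks = {
    "K vec C = vec C'": np.allclose(K @ vec(C), vec(C.T)),
    "D vech B = vec B": np.allclose(D @ vech(B), vec(B)),
    "D+ vec B = vech B": np.allclose(D_plus @ vec(B), vech(B)),
    "vec(aa') = a kron a": np.allclose(vec(np.outer(a, a)), np.kron(a, a)),
}
print(checks)
```

The last check, $\operatorname{vec}(aa') = a \otimes a$, is the identity that the proofs in this section rely on when moving between $\operatorname{vec}$, $\operatorname{vech}$ and Kronecker products.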

In what follows, from Lemma 3.1 through Theorem 3.2, asymptotic results are stated
for a single stratum. The notation $N_\nu$ and $n_\nu$ denotes the size of a generic stratum and the
size of a simple random sample from that stratum.

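Lemma 3.1. Let $\Xi_\nu$ be a $G \times G$ symmetric random matrix defined as

$$\Xi_\nu = \frac{1}{n_\nu - 1} \sum_{i=1}^{n_\nu} (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)'.$$

Suppose that for $\lambda = (\lambda_1, \dots, \lambda_k)'$, any vector of constants, $k = G(G+1)/2$,

$$\lambda' \left(M_\nu^4 - \operatorname{vech} S_\nu \operatorname{vech}' S_\nu\right) \lambda \geq \epsilon \max_{1 \leq \alpha \leq k} \left[\lambda_\alpha^2 \, e_k^{\alpha\prime} \left(M_\nu^4 - \operatorname{vech} S_\nu \operatorname{vech}' S_\nu\right) e_k^{\alpha}\right], \qquad (1)$$

where $e_k^{\alpha} = (0, \dots, 0, 1, 0, \dots, 0)'$ is the $\alpha$-th vector of the canonical base of $\mathbb{R}^k$, $\epsilon > 0$ and
independent of $\nu > 1$, and

$$M_\nu^4 = \frac{1}{N_\nu} D_G^{+} \left[\sum_{i=1}^{N_\nu} (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)' \otimes (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)'\right] D_G^{+\prime}$$

is the fourth central moment. Assume that $n_\nu \to \infty$, $N_\nu - n_\nu \to \infty$, $N_\nu \to \infty$, and that,
for all $j = 1, \dots, G$,

$$\lim_{\nu \to \infty} \frac{n_\nu}{N_\nu} = 0 \;\Rightarrow\; \lim_{\nu \to \infty} \max_{1 \leq i_1 < \cdots < i_{n_\nu} \leq N_\nu} \sum_{\beta=1}^{n_\nu} \frac{\left[\left(y_{\nu i_\beta}^j - \bar Y_\nu^j\right)^2 - S_{\nu j}^2\right]^2}{N_\nu \left[m_{\nu j}^4 - \left(S_{\nu j}^2\right)^2\right]} = 0, \qquad (2)$$

where

$$m_{\nu j}^4 = \frac{1}{N_\nu} \sum_{i=1}^{N_\nu} \left(y_{\nu i}^j - \bar Y_\nu^j\right)^4.$$

Then, $\operatorname{vech} \Xi_\nu$ is asymptotically normally distributed as

$$\operatorname{vech} \Xi_\nu \overset{d}{\to} \mathcal{N}_k\left(\mathrm{E}(\operatorname{vech} \Xi_\nu), \mathrm{Cov}(\operatorname{vech} \Xi_\nu)\right),$$

with

$$\mathrm{E}(\operatorname{vech} \Xi_\nu) = \frac{n_\nu}{n_\nu - 1} \operatorname{vech} S_\nu, \qquad (3)$$

and

$$\mathrm{Cov}(\operatorname{vech} \Xi_\nu) = \frac{n_\nu}{(n_\nu - 1)^2} \left(M_\nu^4 - \operatorname{vech} S_\nu \operatorname{vech}' S_\nu\right). \qquad (4)$$

Here $n_\nu$ is the sample size for a simple random sample from the $\nu$-th population of size $N_\nu$.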

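Remark 3.1. Let

$$\Xi_\nu = \frac{1}{n_\nu - 1} \sum_{i=1}^{n_\nu} (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)'.$$

Hence,

$$\operatorname{vec} \Xi_\nu = \frac{1}{n_\nu - 1} \sum_{i=1}^{n_\nu} \operatorname{vec} (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)'
= \frac{1}{n_\nu - 1} \sum_{i=1}^{n_\nu} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu).$$

From where

$$\operatorname{vech} \Xi_\nu = \frac{1}{n_\nu - 1} \sum_{i=1}^{n_\nu} D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu) \in \mathbb{R}^k, \qquad k = G(G+1)/2.$$

i) Taking $m = k$ and $a_{\nu i} = (a_{\nu i}^1, \dots, a_{\nu i}^k)' = D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu)$ in Hajek (1961),
it is obtained that $\operatorname{vech} \Xi_\nu$ can be expressed as

$$\operatorname{vech} \Xi_\nu = \sum_{i=1}^{N_\nu} b_{\nu i} \, a_{\nu R_{\nu i}},$$

with $b$'s fixed; furthermore $b_{\nu 1} = \cdots = b_{\nu n_\nu} = 1/(n_\nu - 1)$ and $b_{\nu i} = 0$ for $i > n_\nu$.
Then

$$\lim_{\nu \to \infty} \frac{\max_{1 \leq i \leq N_\nu} (b_{\nu i} - \bar b_\nu)^2}{\sum_{i=1}^{N_\nu} (b_{\nu i} - \bar b_\nu)^2} = 0, \quad \text{where} \quad \bar b_\nu = \frac{1}{N_\nu} \sum_{i=1}^{N_\nu} b_{\nu i},$$

holds if $n_\nu \to \infty$, $N_\nu - n_\nu \to \infty$.

ii) $\bar a_\nu = (\bar a_\nu^1, \dots, \bar a_\nu^k)'$ is

$$\bar a_\nu = \frac{1}{N_\nu} \sum_{i=1}^{N_\nu} D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu)
= \operatorname{vech} \frac{1}{N_\nu} \sum_{i=1}^{N_\nu} (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)'
= \operatorname{vech} S_\nu.$$

iii) From (7.2) in Hajek (1961)

$$\sum_{i=1}^{N_\nu} \left[\sum_{\alpha=1}^{k} \lambda_\alpha \left(a_{\nu i}^\alpha - \bar a_\nu^\alpha\right)\right]^2 \geq \epsilon \max_{1 \leq \alpha \leq k} \left[\lambda_\alpha^2 \sum_{i=1}^{N_\nu} \left(a_{\nu i}^\alpha - \bar a_\nu^\alpha\right)^2\right]. \qquad (5)$$

In the context of sampling theory the left side in (5) can be written as

$$\begin{aligned}
\sum_{i=1}^{N_\nu} \left[\sum_{\alpha=1}^{k} \lambda_\alpha \left(a_{\nu i}^\alpha - \bar a_\nu^\alpha\right)\right]^2
&= \sum_{i=1}^{N_\nu} \left\{\lambda' \left[D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu) - \operatorname{vech} S_\nu\right]\right\}^2 \\
&= \lambda' \sum_{i=1}^{N_\nu} \left[D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu) - \operatorname{vech} S_\nu\right]
\left[(y_{\nu i} - \bar Y_\nu)' \otimes (y_{\nu i} - \bar Y_\nu)' D_G^{+\prime} - \operatorname{vech}' S_\nu\right] \lambda \\
&= \lambda' \Bigg[D_G^{+} \sum_{i=1}^{N_\nu} (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)' \otimes (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)' D_G^{+\prime} \\
&\qquad - \operatorname{vech} S_\nu \sum_{i=1}^{N_\nu} (y_{\nu i} - \bar Y_\nu)' \otimes (y_{\nu i} - \bar Y_\nu)' D_G^{+\prime} \\
&\qquad - D_G^{+} \sum_{i=1}^{N_\nu} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu) \operatorname{vech}' S_\nu + N_\nu \operatorname{vech} S_\nu \operatorname{vech}' S_\nu\Bigg] \lambda \\
&= N_\nu \lambda' \left(M_\nu^4 - \operatorname{vech} S_\nu \operatorname{vech}' S_\nu\right) \lambda, \qquad (6)
\end{aligned}$$

where $M_\nu^4$ is

$$M_\nu^4 = \frac{1}{N_\nu} D_G^{+} \left[\sum_{i=1}^{N_\nu} (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)' \otimes (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)'\right] D_G^{+\prime}. \qquad (7)$$

Similarly, the right side of (5) is

$$\begin{aligned}
\lambda_\alpha^2 \sum_{i=1}^{N_\nu} \left(a_{\nu i}^\alpha - \bar a_\nu^\alpha\right)^2
&= \sum_{i=1}^{N_\nu} \left\{\lambda_\alpha e_k^{\alpha\prime} \left[D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu) - \operatorname{vech} S_\nu\right]\right\}^2 \\
&= \lambda_\alpha^2 \sum_{i=1}^{N_\nu} \left\{e_k^{\alpha\prime} \left[D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu) - \operatorname{vech} S_\nu\right]\right\}^2.
\end{aligned}$$

Then, proceeding as in the derivation of (6),

$$\sum_{i=1}^{N_\nu} \left(a_{\nu i}^\alpha - \bar a_\nu^\alpha\right)^2 = N_\nu \, e_k^{\alpha\prime} \left(M_\nu^4 - \operatorname{vech} S_\nu \operatorname{vech}' S_\nu\right) e_k^{\alpha}. \qquad (8)$$

Therefore, from (6) and (8), (1) is established.

iv) The expression for (2) is found analogously to the procedure described in item iii).

v) Finally,

$$\begin{aligned}
\mathrm{E}(\operatorname{vech} \Xi_\nu) &= \frac{1}{n_\nu - 1} \sum_{i=1}^{n_\nu} \mathrm{E}\, D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu) \\
&= \frac{1}{n_\nu - 1} \sum_{i=1}^{n_\nu} \operatorname{vech} \mathrm{E} (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)' \\
&= \frac{1}{n_\nu - 1} \sum_{i=1}^{n_\nu} \operatorname{vech} S_\nu \\
&= \frac{n_\nu}{n_\nu - 1} \operatorname{vech} S_\nu.
\end{aligned}$$

Similarly, by independence

$$\begin{aligned}
\mathrm{Cov}(\operatorname{vech} \Xi_\nu) &= \frac{1}{(n_\nu - 1)^2} \sum_{i=1}^{n_\nu} \mathrm{Cov}\left[D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu)\right] \\
&= \frac{1}{(n_\nu - 1)^2} \sum_{i=1}^{n_\nu} \Big\{\mathrm{E}\left[D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)' \otimes (y_{\nu i} - \bar Y_\nu)' D_G^{+\prime}\right] \\
&\qquad - \mathrm{E}\left[D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu)\right] \mathrm{E}\left[(y_{\nu i} - \bar Y_\nu)' \otimes (y_{\nu i} - \bar Y_\nu)' D_G^{+\prime}\right]\Big\} \\
&= \frac{1}{(n_\nu - 1)^2} \sum_{i=1}^{n_\nu} \left(M_\nu^4 - \operatorname{vech} S_\nu \operatorname{vech}' S_\nu\right) \\
&= \frac{n_\nu}{(n_\nu - 1)^2} \left(M_\nu^4 - \operatorname{vech} S_\nu \operatorname{vech}' S_\nu\right),
\end{aligned}$$

the last expression is obtained observing that

$$\mathrm{E}\left[D_G^{+} (y_{\nu i} - \bar Y_\nu) \otimes (y_{\nu i} - \bar Y_\nu)\right] = \operatorname{vech} \mathrm{E}\left[(y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)'\right] = \operatorname{vech} S_\nu$$

and that

$$\mathrm{E}\left\{D_G^{+} (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)' \otimes (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)' D_G^{+\prime}\right\} = M_\nu^4,$$

where $M_\nu^4$ is defined in (7). □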
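The algebra behind (6) and (7) can be verified numerically on a small artificial stratum. The sketch below is an illustration only: the stratum values are simulated (hypothetical), the helper `duplication_matrix` is an illustrative name, and $D_G^{+}$ is taken as the Moore-Penrose inverse computed by `numpy.linalg.pinv`.

```python
import numpy as np

def duplication_matrix(n):
    """D_n such that D_n @ vech(B) == vec(B) for symmetric n x n B."""
    D = np.zeros((n * n, n * (n + 1) // 2))
    col = 0
    for j in range(n):
        for i in range(j, n):
            D[j * n + i, col] = 1.0
            D[i * n + j, col] = 1.0
            col += 1
    return D

rng = np.random.default_rng(0)
N_nu, G = 40, 3
Y = rng.normal(size=(N_nu, G))        # hypothetical population values of one stratum
dev = Y - Y.mean(axis=0)              # rows are y_i - Ybar
D_plus = np.linalg.pinv(duplication_matrix(G))

# a_i = D+ (y_i - Ybar) kron (y_i - Ybar), one row per population unit
a = np.array([D_plus @ np.kron(d, d) for d in dev])

# vech S with S = (1/N) sum (y_i - Ybar)(y_i - Ybar)'
S = dev.T @ dev / N_nu
vech_S = D_plus @ S.reshape(-1, order="F")

# M^4 as in (7)
kron_sum = sum(np.kron(np.outer(d, d), np.outer(d, d)) for d in dev)
M4 = D_plus @ kron_sum @ D_plus.T / N_nu

# Both sides of (6) for an arbitrary vector of constants lambda
lam = rng.normal(size=a.shape[1])
lhs = np.sum(((a - vech_S) @ lam) ** 2)
rhs = N_nu * lam @ (M4 - np.outer(vech_S, vech_S)) @ lam
print(lhs, rhs)
```

The two printed numbers should agree up to floating-point error, and the row mean of `a` should coincide with `vech_S`, which is item ii) of the remark.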

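Theorem 3.1. Under the assumptions in Lemma 3.1, the sequence of sample covariance
matrices $s_\nu$ is such that $\operatorname{vech} s_\nu$ has an asymptotic normal distribution with asymptotic mean
and covariance matrix given by (3) and (4), respectively.

Proof. This follows immediately from Lemma 3.1; only observe that

$$s_\nu = \frac{1}{n_\nu - 1} \sum_{i=1}^{n_\nu} (y_{\nu i} - \bar y_\nu)(y_{\nu i} - \bar y_\nu)'
= \frac{1}{n_\nu - 1} \sum_{i=1}^{n_\nu} (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)' - \frac{n_\nu}{n_\nu - 1} (\bar y_\nu - \bar Y_\nu)(\bar y_\nu - \bar Y_\nu)',$$

where $\frac{n_\nu}{n_\nu - 1} \to 1$ and $(\bar y_\nu - \bar Y_\nu)(\bar y_\nu - \bar Y_\nu)' \to 0$ in probability. □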



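Remark 3.2. Observe that it is possible to find the asymptotic distribution of $\operatorname{vec} s_\nu$, but
this asymptotic normal distribution is singular, because $\mathrm{Cov}(\operatorname{vec} s_\nu)$ is singular. This is due
to the fact that $\mathrm{Cov}(\operatorname{vec} s_\nu)$ is the $G^2 \times G^2$ covariance matrix in the asymptotic
distribution of $\operatorname{vec} s_\nu$ and, because $s_\nu$ is symmetric, $\operatorname{vec} s_\nu$ has repeated elements. In
this case, $\operatorname{vec} s_\nu$ is asymptotically normally distributed as (see Muirhead (1982))

$$\operatorname{vec} s_\nu \overset{d}{\to} \mathcal{N}_{G^2}\left(\mathrm{E}(\operatorname{vec} \Xi_\nu), \mathrm{Cov}(\operatorname{vec} \Xi_\nu)\right),$$

where

$$\mathrm{E}(\operatorname{vec} \Xi_\nu) = \frac{n_\nu}{n_\nu - 1} \operatorname{vec} S_\nu,$$

$$\mathrm{Cov}(\operatorname{vec} \Xi_\nu) = \frac{n_\nu}{(n_\nu - 1)^2} \left(\mathfrak{M}_\nu^4 - \operatorname{vec} S_\nu \operatorname{vec}' S_\nu\right),$$

and

$$\mathfrak{M}_\nu^4 = \frac{1}{N_\nu} \sum_{i=1}^{N_\nu} (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)' \otimes (y_{\nu i} - \bar Y_\nu)(y_{\nu i} - \bar Y_\nu)'.$$

□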

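Proceeding in an analogous way as in Lemma 3.1 and Remark 3.1, it is obtained:

Theorem 3.2. Suppose that for $\lambda = (\lambda_1, \dots, \lambda_G)'$, any vector of constants,

$$\lambda' S_\nu \lambda \geq \epsilon \max_{1 \leq \alpha \leq G} \left[\lambda_\alpha^2 S_{\nu_{\alpha\alpha}}\right]. \qquad (9)$$

Assume that $n_\nu \to \infty$, $N_\nu - n_\nu \to \infty$, $N_\nu \to \infty$, and that, for all $j = 1, \dots, G$,

$$\lim_{\nu \to \infty} \frac{n_\nu}{N_\nu} = 0 \;\Rightarrow\; \lim_{\nu \to \infty} \max_{1 \leq i_1 < \cdots < i_{n_\nu} \leq N_\nu} \sum_{\beta=1}^{n_\nu} \frac{\left(y_{\nu i_\beta}^j - \bar Y_\nu^j\right)^2}{N_\nu S_{\nu j}^2} = 0.$$

Then, $\bar y_\nu$ is asymptotically normally distributed as

$$\bar y_\nu \overset{d}{\to} \mathcal{N}_G(\bar Y_\nu, S_\nu).$$

Here $n_\nu$ is the sample size for a simple random sample from the $\nu$-th population of size $N_\nu$.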

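As a direct consequence of Theorem 3.1 it is obtained:

Theorem 3.3. Let $\widehat{\mathrm{Cov}}(\bar y_{ST})$ be the estimator of the covariance matrix of $\bar y_{ST}$; then

$$\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST}) = \sum_{h=1}^{H} \left(\frac{W_h^2}{n_h} - \frac{W_h}{N}\right) \operatorname{vech} s_h \qquad (10)$$

is asymptotically normally distributed; furthermore

$$\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST}) \overset{d}{\to} \mathcal{N}_k\left(\mathrm{E}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right), \mathrm{Cov}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)\right), \qquad (11)$$

where

$$\mathrm{E}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right) = \sum_{h=1}^{H} \left(\frac{W_h^2}{n_h} - \frac{W_h}{N}\right) \frac{n_h}{n_h - 1} \operatorname{vech} S_h, \qquad (12)$$

$$\mathrm{Cov}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right) = \sum_{h=1}^{H} \left(\frac{W_h^2}{n_h} - \frac{W_h}{N}\right)^2 \frac{n_h}{(n_h - 1)^2} \left(M_h^4 - \operatorname{vech} S_h \operatorname{vech}' S_h\right), \qquad (13)$$

and

$$M_h^4 = \frac{1}{N_h} D_G^{+} \left[\sum_{i=1}^{N_h} (y_{hi} - \bar Y_h)(y_{hi} - \bar Y_h)' \otimes (y_{hi} - \bar Y_h)(y_{hi} - \bar Y_h)'\right] D_G^{+\prime}.$$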

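Observe that the asymptotic means and covariance matrices of the asymptotically
normal distributions of $\bar y_h$, $\operatorname{vech} s_h$, $\operatorname{vec} \widehat{\mathrm{Cov}}(\bar y_{ST})$ and $\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})$ are in terms of the
population parameters $\bar Y_h$, $\operatorname{vech} S_h$, $\mathfrak{M}_h^4$ and $M_h^4$; then, from Rao (1973, iv), pp. 388-389),
approximations of asymptotic distributions can be obtained using consistent estimators
instead of population parameters. In what follows, the following substitutions are used:

$$\bar Y_h \to \bar y_h, \quad \operatorname{vech} S_h \to \operatorname{vech} s_h, \quad \mathfrak{M}_h^4 \to \mathfrak{m}_h^4 \quad \text{and} \quad M_h^4 \to m_h^4, \qquad (14)$$

where

$$m_h^4 = \frac{1}{n_h} D_G^{+} \left[\sum_{i=1}^{n_h} (y_{hi} - \bar y_h)(y_{hi} - \bar y_h)' \otimes (y_{hi} - \bar y_h)(y_{hi} - \bar y_h)'\right] D_G^{+\prime}$$

and

$$\mathfrak{m}_h^4 = \frac{1}{n_h} \sum_{i=1}^{n_h} (y_{hi} - \bar y_h)(y_{hi} - \bar y_h)' \otimes (y_{hi} - \bar y_h)(y_{hi} - \bar y_h)'.$$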
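As an illustration of the substitutions (14), the sketch below computes plug-in versions of $\operatorname{vech} s_h$, $m_h^4$ and the covariance in (4) from one pilot sample. It is an assumption-laden example rather than the authors' code: the pilot data are simulated, the function names are illustrative, and it uses the shortcut $D_G^{+}(d \otimes d) = \operatorname{vech}(dd')$, which holds because $dd'$ is symmetric, so no duplication matrix has to be built.

```python
import numpy as np

def vech_sym(B):
    """vech of a symmetric matrix.

    For symmetric B the upper triangle read row by row equals the
    lower triangle read column by column, which is the vech ordering.
    """
    return B[np.triu_indices(B.shape[0])]

def plug_in_moments(y):
    """Return vech(s_h), m_h^4 and the plug-in covariance of vech(s_h).

    y is an n_h x G array holding the pilot sample of one stratum.
    """
    y = np.asarray(y, dtype=float)
    n_h = y.shape[0]
    dev = y - y.mean(axis=0)
    s_h = dev.T @ dev / (n_h - 1)                      # sample covariance
    vech_s = vech_sym(s_h)
    # a_i = vech((y_i - ybar)(y_i - ybar)') = D+ (y_i - ybar) kron (y_i - ybar)
    a = np.array([vech_sym(np.outer(d, d)) for d in dev])
    m4 = a.T @ a / n_h                                 # m_h^4 of (14)
    cov_vech_s = n_h / (n_h - 1) ** 2 * (m4 - np.outer(vech_s, vech_s))
    return vech_s, m4, cov_vech_s

rng = np.random.default_rng(1)
pilot = rng.normal(size=(30, 2))       # hypothetical pilot sample, G = 2
vech_s, m4, cov_vech_s = plug_in_moments(pilot)
print(vech_s)
print(cov_vech_s)
```

Here `n_h` is the pilot sample size, which stays fixed; it is not the decision variable of the allocation problems in Section 4.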
4 OPTIMUM ALLOCATION IN MULTIVARIATE STRATIFIED RANDOM
SAMPLING VIA STOCHASTIC MATRIX MATHEMATICAL PROGRAMMING

When the variances are the objective functions, subject to a certain cost function, the
optimum allocation in multivariate stratified random sampling can be expressed as the following
matrix mathematical programming problem using a deterministic approach

$$\begin{array}{c}
\min\limits_{n} \; \widehat{\mathrm{Cov}}(\bar y_{ST}) \\
\text{subject to} \\
c'n + c_0 = C \\
2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, H \\
n_h \in \mathbb{N},
\end{array} \qquad (15)$$

where $\mathbb{N}$ denotes the set of natural numbers. Problem (15) has been studied in detail by Diaz-Garcia
and Ulloa (2008).

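Observing that $\widehat{\mathrm{Cov}}(\bar y_{ST})$ is in terms of $s_{h_{jk}}$, which are random variables, the optimum
allocation of (15) via stochastic mathematical programming can be stated as the following
stochastic matrix mathematical programming problem, see Prekopa (1995) and Stancu-Minasian
(1984),

$$\begin{array}{c}
\min\limits_{n} \; \widehat{\mathrm{Cov}}(\bar y_{ST}) \\
\text{subject to} \\
c'n + c_0 = C \\
2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, H \\
\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST}) \overset{d}{\to} \mathcal{N}_k\left(\mathrm{E}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right), \mathrm{Cov}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)\right) \\
n_h \in \mathbb{N},
\end{array} \qquad (16)$$

where $\mathrm{E}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)$ and $\mathrm{Cov}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)$ are given by (12) and (13) respectively.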

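Observe that $\widehat{\mathrm{Cov}}(\bar y_{ST})$ is an explicit function of $n$, and so it must be denoted as
$\widehat{\mathrm{Cov}}(\bar y_{ST}) \equiv \widehat{\mathrm{Cov}}(\bar y_{ST}(n))$. Also, assume that $\widehat{\mathrm{Cov}}(\bar y_{ST}(n))$ is a positive definite matrix
for all $n$, $\widehat{\mathrm{Cov}}(\bar y_{ST}(n)) > 0$. Now, let $n_1$ and $n_2$ be two possible values of the vector $n$ and
recall that, for $A$ and $B$ positive definite matrices, $A > B \Leftrightarrow A - B > 0$.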

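Then, proceeding as in Diaz-Garcia and Ulloa (2008), the stochastic solution of (16) is
reduced to the following stochastic uniobjective mathematical programming problem

$$\begin{array}{c}
\min\limits_{n} \; f\left(\widehat{\mathrm{Cov}}(\bar y_{ST})\right) \\
\text{subject to} \\
c'n + c_0 = C \\
2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, H \\
\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST}) \overset{d}{\to} \mathcal{N}_k\left(\mathrm{E}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right), \mathrm{Cov}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)\right) \\
n_h \in \mathbb{N},
\end{array} \qquad (17)$$

where the function $f$ is such that $f : \mathcal{S} \to \mathbb{R}$,

$$\widehat{\mathrm{Cov}}(\bar y_{ST}(n_1)) < \widehat{\mathrm{Cov}}(\bar y_{ST}(n_2)) \;\Rightarrow\; f\left(\widehat{\mathrm{Cov}}(\bar y_{ST}(n_1))\right) < f\left(\widehat{\mathrm{Cov}}(\bar y_{ST}(n_2))\right), \qquad (18)$$

with $\widehat{\mathrm{Cov}}(\bar y_{ST}(n)) \in \mathcal{S} \subset \mathbb{R}^{G \times G}$, and $\mathcal{S}$ is the set of positive definite matrices.

Unfortunately or fortunately, the function $f(\cdot)$ is not unique. Some alternatives for
$f\left(\widehat{\mathrm{Cov}}(\bar y_{ST}(n))\right)$ are $\operatorname{tr}(\cdot)$; $|\cdot|$; $\lambda_{\max}(\cdot)$, where $\lambda_{\max}$ is the maximum eigenvalue; $\lambda_{\min}(\cdot)$,
where $\lambda_{\min}$ is the minimum eigenvalue; and $\lambda_j(\cdot)$, where $\lambda_j$ is the $j$-th eigenvalue, among others.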
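To make the role of the value function $f$ concrete, the sketch below evaluates $\widehat{\mathrm{Cov}}(\bar y_{ST}(n)) = \sum_h (W_h^2/n_h - W_h/N)\, s_h$ and several candidate value functions. The stratum sizes, variances and covariances are those of Table 1 in Section 5, used as plug-in values for $s_h$, and the allocation is the "BA" row of Table 2; the function and variable names are illustrative assumptions.

```python
import numpy as np

# Table 1: stratum sizes, variances (BA, Vol.) and covariances
N_h = np.array([11131, 65857, 106936, 72872, 78260,
                51401, 24050, 46113, 102985], dtype=float)
var_ba = np.array([1557, 3575, 3163, 6095, 10470,
                   8406, 20115, 9718, 2478], dtype=float)
var_vol = np.array([554830, 1430600, 1997100, 5587900, 10603000,
                    15828000, 26643000, 13603000, 1061800], dtype=float)
cov_ba_vol = np.array([28980, 61591, 72369, 166120, 293960,
                       357300, 663300, 346810, 39872], dtype=float)

# One 2 x 2 covariance matrix s_h per stratum
s = np.empty((9, 2, 2))
s[:, 0, 0] = var_ba
s[:, 1, 1] = var_vol
s[:, 0, 1] = s[:, 1, 0] = cov_ba_vol

def cov_stratified_mean(n, N_h, s):
    """Estimated covariance matrix of the stratified mean for allocation n."""
    n = np.asarray(n, dtype=float)
    N = N_h.sum()
    W = N_h / N
    weights = W ** 2 / n - W / N
    return np.einsum("h,hjk->jk", weights, s)

value_functions = {
    "trace": np.trace,
    "determinant": np.linalg.det,
    "lambda_max": lambda A: np.linalg.eigvalsh(A)[-1],
    "lambda_min": lambda A: np.linalg.eigvalsh(A)[0],
}

n_ba = np.array([10, 94, 144, 136, 191, 113, 81, 109, 122])  # "BA" row of Table 2
cov_hat = cov_stratified_mean(n_ba, N_h, s)
values = {name: float(f(cov_hat)) for name, f in value_functions.items()}
print(cov_hat)
print(values)
```

Under these assumptions the diagonal of `cov_hat` should come out close to the two variances reported for the BA allocation in Table 2. Increasing every $n_h$ lowers all four value functions, in line with (18).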

Note that (17) is a stochastic uniobjective mathematical programming problem; hence, any
technique of stochastic uniobjective mathematical programming can be applied, for example:

Point $n \in \mathbb{N}^H$ is the expected modified value solution to (17) if it is an efficient solution
in the Pareto¹ sense to the following deterministic uniobjective mathematical programming
problem

$$\begin{array}{c}
\min\limits_{n} \; k_1 \mathrm{E}\left(f\left(\widehat{\mathrm{Cov}}(\bar y_{ST})\right)\right) + k_2 \sqrt{\mathrm{Var}\left(f\left(\widehat{\mathrm{Cov}}(\bar y_{ST})\right)\right)} \\
\text{subject to} \\
c'n + c_0 = C \\
2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, H \\
n_h \in \mathbb{N}.
\end{array} \qquad (19)$$

Here $k_1$ and $k_2$ are non negative constants, and their values show the relative importance of
the expectation and the covariance matrix $\widehat{\mathrm{Cov}}(\bar y_{ST})$. Some authors suggest that $k_1 + k_2 = 1$,
see Rao (1979, p. 599). Observe that if $k_1$ and $k_2$ are such that $k_1 = 1$ and $k_2 = 0$ in (19),
the resulting method is known as the $E$-model. Alternatively, if $k_1 = 0$ and $k_2 = 1$, the
method is called the $V$-model, see Charnes and Cooper (1963), Prekopa (1995) and Uryasev
and Pardalos (2001).

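Alternatively, the point $n \in \mathbb{N}^H$ is a minimum risk solution of the aspiration level $\tau$ to
the problem (17) (also termed $P$-model, see Charnes and Cooper (1963)) if it is an efficient
solution in the Pareto sense of the uniobjective stochastic optimisation problem

$$\begin{array}{c}
\min\limits_{n} \; \mathrm{P}\left(f\left(\widehat{\mathrm{Cov}}(\bar y_{ST})\right) \leq \tau\right) \\
\text{subject to} \\
c'n + c_0 = C \\
2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, H \\
n_h \in \mathbb{N}.
\end{array} \qquad (20)$$

In Section 5 the solution is studied for the case when $f = \operatorname{tr}\left(\widehat{\mathrm{Cov}}(\bar y_{ST})\right)$ and the case
when $f = \left|\widehat{\mathrm{Cov}}(\bar y_{ST})\right|$. These solutions are implemented in the context of problems (19)
and (20).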

Finally, note that so far, the cost constraint $\sum_{h=1}^{H} c_h n_h + c_0 = C$ has been used in every
stochastic mathematical programming method. However, in diverse situations, this cost
restriction could represent existing restrictions on the availability of man-hours for carrying
out a survey, or restrictions on the total available time for performing the survey, etc. These
limitations can be established by using the following constraint, see Arthanari and Dodge


¹For the sampling context, observe that in matrix mathematical programming problems, there rarely
exists a point $n^*$ which is considered as a minimum. Alternatively, it is said that $f^*(n)$ is a Pareto point
of $f(n) = (f_1(n), \dots, f_G(n))'$ if there is no other point $f^1(n)$ such that $f^1(n) \leq f^*(n)$, i.e. for all $j$,
$f_j^1(n) \leq f_j^*(n)$ and $f^1(n) \neq f^*(n)$.
Table 1: Variances, covariances and the number of units within each stratum

                          Variance
Stratum       N_h        BA          Vol.     Covariance
   1        11 131      1 557      554 830       28 980
   2        65 857      3 575    1 430 600       61 591
   3       106 936      3 163    1 997 100       72 369
   4        72 872      6 095    5 587 900      166 120
   5        78 260     10 470   10 603 000      293 960
   6        51 401      8 406   15 828 000      357 300
   7        24 050     20 115   26 643 000      663 300
   8        46 113      9 718   13 603 000      346 810
   9       102 985      2 478    1 061 800       39 872

(1981):

$$\sum_{h=1}^{H} n_h = n.$$



5 APPLICATION 

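The input information was taken from Arvanitis and Afonja (1971), in which they describe
a forest survey conducted in Humboldt County, California. The population was subdivided
into nine strata on the basis of the timber volume per unit area, as determined from aerial
photographs. The two variables included in this example are the basal area (BA)² in
square feet, and the net volume in cubic feet (Vol.), both expressed on a per acre basis.
The variances, covariances and the number of units within stratum $h$ are listed in Table 1.
For this example, the matrix optimisation problem under approach (17) is

$$\begin{array}{c}
\min\limits_{n} \; f \begin{pmatrix}
\widehat{\mathrm{Var}}(\bar y_{ST}^1) & \widehat{\mathrm{Cov}}(\bar y_{ST}^1, \bar y_{ST}^2) \\
\widehat{\mathrm{Cov}}(\bar y_{ST}^2, \bar y_{ST}^1) & \widehat{\mathrm{Var}}(\bar y_{ST}^2)
\end{pmatrix} \\
\text{subject to} \\
\sum_{h=1}^{9} n_h = 1000 \\
2 \leq n_h \leq N_h, \quad h = 1, \dots, 9 \\
\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST}) \overset{d}{\to} \mathcal{N}_3\left(\mathrm{E}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right), \mathrm{Cov}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)\right) \\
n_h \in \mathbb{N}.
\end{array} \qquad (21)$$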

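5.1 Solution when $f(\cdot) = \operatorname{tr}(\cdot)$

Note that by (11), (12) and (13)

$$\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST}) \overset{d}{\to} \mathcal{N}\left(\mathrm{E}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right), \mathrm{Var}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)\right)$$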

²In forestry terminology, 'Basal area' is the area of a plant perpendicular to the longitudinal axis of a
tree at 4.5 feet above ground.
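where

$$\mathrm{E}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right) = \sum_{j=1}^{G} \sum_{h=1}^{H} \left(\frac{W_h^2}{n_h} - \frac{W_h}{N}\right) \frac{n_h}{n_h - 1} S_{hj}^2,$$

$$\mathrm{Var}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right) = \sum_{j=1}^{G} \sum_{h=1}^{H} \left(\frac{W_h^2}{n_h} - \frac{W_h}{N}\right)^2 \frac{n_h}{(n_h - 1)^2} \left[M_{hj}^4 - \left(S_{hj}^2\right)^2\right],$$

and

$$M_{hj}^4 = \frac{1}{N_h} \sum_{i=1}^{N_h} \left(y_{hi}^j - \bar Y_h^j\right)^4.$$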

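Therefore, considering the substitutions (14), the equivalent deterministic uniobjective
mathematical programming problem to the stochastic mathematical programming problem (21) via the
modified $E$-model is

$$\begin{array}{c}
\min\limits_{n} \; k_1 \widehat{\mathrm{E}}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right) + k_2 \sqrt{\widehat{\mathrm{Var}}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)} \\
\text{subject to} \\
\sum_{h=1}^{9} n_h = 1000 \\
2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, 9 \\
n_h \in \mathbb{N},
\end{array}$$

where

$$\widehat{\mathrm{E}}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right) = \sum_{j=1}^{2} \sum_{h=1}^{9} \left(\frac{W_h^2}{n_h} - \frac{W_h}{N}\right) \frac{n_h}{n_h - 1} s_{hj}^2, \qquad (22)$$

$$\widehat{\mathrm{Var}}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right) = \sum_{j=1}^{2} \sum_{h=1}^{9} \left(\frac{W_h^2}{n_h} - \frac{W_h}{N}\right)^2 \frac{n_h}{(n_h - 1)^2} \left[m_{hj}^4 - \left(s_{hj}^2\right)^2\right], \qquad (23)$$

and

$$m_{hj}^4 = \frac{1}{n_h} \sum_{i=1}^{n_h} \left(y_{hi}^j - \bar y_h^j\right)^4. \qquad (24)$$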



Remark 5.1. Observe that the estimators $\bar y_h^j$, $s_{hj}^2$ and $m_{hj}^4$ of $\bar Y_h^j$, $S_{hj}^2$ and $M_{hj}^4$ are initially
obtained as

i) a consequence of a pilot study (or preliminary sample) or

ii) using the corresponding values of the estimators of another variable $X$ correlated to
the variable $Y$.

It is important to keep this in mind in the minimisation step because, for example, the
$n_h$'s that appear in expression (24) are the fixed values of $n_h$ used in the pilot study. The same
comment applies to the expressions of the estimators $\bar y_h^j$ and $s_{hj}^2$. In contrast, the $n_h$'s that appear in
expressions (22) and (23) are the decision variables. □
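For the special case $k_1 = 1$, $k_2 = 0$ (the $E$-model), the objective (22) is separable, since $\left(\frac{W_h^2}{n_h} - \frac{W_h}{N}\right)\frac{n_h}{n_h - 1} = \frac{W_h (W_h - 1/N)}{n_h - 1} - \frac{W_h}{N}$, and each term is convex and decreasing in $n_h$. A greedy marginal allocation therefore solves the integer problem for this case. The sketch below is an illustration under stated assumptions, not the procedure used in the paper (which relied on LINGO): it plugs the variances of Table 1 into (22), and the function names are illustrative.

```python
import heapq
import numpy as np

# Table 1: stratum sizes and variances of BA and Vol., used as s^2_hj in (22)
N_h = np.array([11131, 65857, 106936, 72872, 78260,
                51401, 24050, 46113, 102985], dtype=float)
var_ba = np.array([1557, 3575, 3163, 6095, 10470,
                   8406, 20115, 9718, 2478], dtype=float)
var_vol = np.array([554830, 1430600, 1997100, 5587900, 10603000,
                    15828000, 26643000, 13603000, 1061800], dtype=float)
s2_sum = var_ba + var_vol          # sum over j of s^2_hj

def expected_trace(n, N_h, s2_sum):
    """Objective (22) for an allocation n."""
    n = np.asarray(n, dtype=float)
    N = N_h.sum()
    W = N_h / N
    return float(np.sum((W ** 2 / n - W / N) * n / (n - 1) * s2_sum))

def greedy_e_model(N_h, s2_sum, total):
    """Greedy marginal allocation for the E-model with sum(n) == total.

    Assumes total lies between 2*H and sum(N_h).
    """
    H = len(N_h)
    N = N_h.sum()
    W = N_h / N
    coef = s2_sum * W * (W - 1.0 / N)   # stratum term is coef/(n-1) minus a constant
    n = [2] * H

    def gain(h):
        # decrease of the objective when n[h] grows by one unit
        return coef[h] * (1.0 / (n[h] - 1) - 1.0 / n[h])

    heap = [(-gain(h), h) for h in range(H)]
    heapq.heapify(heap)
    for _ in range(total - 2 * H):
        _, h = heapq.heappop(heap)
        n[h] += 1
        if n[h] < N_h[h]:
            heapq.heappush(heap, (-gain(h), h))
    return n

n_opt = greedy_e_model(N_h, s2_sum, 1000)
print(n_opt, expected_trace(n_opt, N_h, s2_sum))
```

Under these assumptions the allocation should land within a unit or two of the $E$-model row of Table 2. The greedy argument does not carry over to $k_2 > 0$, because the square root of (23) is not separable across strata.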
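Similarly, proceeding as in Diaz Garcia et al. (2005), and noting that, if $\Phi$ denotes the
distribution function of the standard Normal distribution, the objective function in (21)
with $f(\cdot) = \operatorname{tr}(\cdot)$ can be written as

$$\min_{n} \; \Phi\left(\frac{\tau - \widehat{\mathrm{E}}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)}{\sqrt{\widehat{\mathrm{Var}}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)}}\right).$$

In this way, since minimising the monotonically increasing distribution function is equivalent
to minimising the value of the associated random variable, the equivalent deterministic
problem to the stochastic mathematical programming problem (21) via the $P$-model is

$$\begin{array}{c}
\min\limits_{n} \; \dfrac{\tau - \widehat{\mathrm{E}}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)}{\sqrt{\widehat{\mathrm{Var}}\left(\operatorname{tr} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)}} \\
\text{subject to} \\
\sum_{h=1}^{9} n_h = 1000 \\
2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, 9 \\
n_h \in \mathbb{N}.
\end{array}$$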

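Remark 5.2. When $f(\cdot) = |\cdot|$, this approach considers the following alternative stochastic
matrix mathematical programming problem

$$\begin{array}{c}
\min\limits_{n} \; \widetilde{\mathrm{Cov}}(\bar y_{ST}) \\
\text{subject to} \\
\sum_{h=1}^{9} n_h = 1000 \\
2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, 9 \\
\operatorname{vech} \widetilde{\mathrm{Cov}}(\bar y_{ST}) \overset{d}{\to} \mathcal{N}_{2 \times 2}\left(\operatorname{vech} 0_{2 \times 2}, \mathrm{Cov}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)\right) \\
n_h \in \mathbb{N},
\end{array} \qquad (25)$$

where $\widetilde{\mathrm{Cov}}(\bar y_{ST}) = \operatorname{vech}^{-1}\left[\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST}) - \mathrm{E}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)\right]$ and $\operatorname{vech}^{-1}$ is the inverse
function of the function $\operatorname{vech}$.
In this way (20) is

$$\begin{array}{c}
\min\limits_{n} \; \left|\widetilde{\mathrm{Cov}}(\bar y_{ST})\right| \\
\text{subject to} \\
\sum_{h=1}^{9} n_h = 1000 \\
2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, 9 \\
\operatorname{vech} \widetilde{\mathrm{Cov}}(\bar y_{ST}) \overset{d}{\to} \mathcal{N}_{2 \times 2}\left(\operatorname{vech} 0_{2 \times 2}, \mathrm{Cov}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)\right) \\
n_h \in \mathbb{N}.
\end{array} \qquad (26)$$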
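Thus, taking into account the substitutions (14), the equivalent deterministic uniobjective
mathematical programming problem to the stochastic mathematical programming problem (26) via
the modified $E$-model is

$$\begin{array}{c}
\min\limits_{n} \; k_1 \mathrm{E}\left(\left|\widetilde{\mathrm{Cov}}(\bar y_{ST})\right|\right) + k_2 \sqrt{\mathrm{Var}\left(\left|\widetilde{\mathrm{Cov}}(\bar y_{ST})\right|\right)} \\
\text{subject to} \\
\sum_{h=1}^{9} n_h = 1000 \\
2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, 9 \\
n_h \in \mathbb{N},
\end{array}$$

where for $G = 2$ and assuming that $\mathrm{Cov}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)$ is such that

$$\mathrm{Cov}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right) = B \otimes B,$$

it is obtained that, see Delannay and Caer (2000),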



E ( c^(y,,) ) = |Nr/^^ (r[i/2] - r[3/2]) , 



and Var 



Cov(y^J 
= |N|V2 



IS 



^r[i/2] - r[3/2] + _ i (r[i/2] - r[3/2])^ 



where $\Gamma[\cdot]$ denotes the gamma function,



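$$N = \sum_{h=1}^{H} \left(\frac{W_h^2}{n_h} - \frac{W_h}{N}\right)^2 \frac{n_h}{(n_h - 1)^2} \left(\mathfrak{m}_h^4 - \operatorname{vec} s_h \operatorname{vec}' s_h\right)$$

and

$$\mathfrak{m}_h^4 = \frac{1}{n_h} \sum_{i=1}^{n_h} (y_{hi} - \bar y_h)(y_{hi} - \bar y_h)' \otimes (y_{hi} - \bar y_h)(y_{hi} - \bar y_h)',$$

see Remark 5.1.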

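Similarly, considering (25) and that $f(\cdot) = |\cdot|$, (20) is restated as

$$\begin{array}{c}
\min\limits_{n} \; \mathrm{P}\left(\left|\widetilde{\mathrm{Cov}}(\bar y_{ST})\right| \leq \tau\right) \\
\text{subject to} \\
\sum_{h=1}^{9} n_h = 1000 \\
2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, 9 \\
\operatorname{vech} \widetilde{\mathrm{Cov}}(\bar y_{ST}) \overset{d}{\to} \mathcal{N}_{2 \times 2}\left(\operatorname{vech} 0_{2 \times 2}, \mathrm{Cov}\left(\operatorname{vech} \widehat{\mathrm{Cov}}(\bar y_{ST})\right)\right).
\end{array}$$

Then, if $\mathcal{G}$ denotes the distribution function of the determinant of $\widetilde{\mathrm{Cov}}(\bar y_{ST})$, the equivalent
deterministic problem to the stochastic mathematical programming problem (21) via the $P$-model
is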

min r|N|^/^ 

n 

subject to

$$\sum_{h=1}^{9} n_h = 1000$$

$$2 \leq n_h \leq N_h, \quad h = 1, 2, \dots, 9$$

$$n_h \in \mathbb{N},$$

where the density of $Z = \left|\widetilde{\mathrm{Cov}}(\bar y_{ST})\right|$ is, see Delannay and Caer (2000)



dGiz) 1 



1 - erf (^\/2lj 



^ > 0, 



where $\operatorname{erf}(\cdot)$ is the usual error function defined as

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x \exp(-t^2)\, dt.$$



□ 



Table 2 shows the optimisation solutions obtained by some of the methods described in
Section 4. Specifically, the solution is presented for the case when the value function is
defined as the trace function, $f(\cdot) = \operatorname{tr}(\cdot)$, and for the following stochastic solutions: modified
$E$-model, $E$-model, $V$-model and $P$-model. Also, the optimum allocation is included
for each characteristic, BA and Vol (the first two rows in Table 2). The last two columns
show the minimum values of the individual variances for the respective optimum allocations
identified by each method. The results were computed using the commercial software Hyper
LINGO/PC, release 6.0, see Winston (1995). The default optimisation methods used by
LINGO to solve the nonlinear integer optimisation programs are Generalised Reduced
Gradient (GRG) and branch-and-bound methods, see Bazaraa et al. (2006). Some technical
details of the computations are the following: the maximum number of iterations of the
methods presented in Table 2 was 2279 (modified $E$-model) and the mean execution time
for all the programs was 4 seconds. Finally, note that the greatest discrepancy among the
stratum sample sizes found by the different methods occurred under the $P$-model. Beyond doubt,
this is a consequence of the choice of the value of $\tau$ needed for the $P$-model
approach.



CONCLUSIONS 



It is difficult to suggest general rules for the selection of a method in stochastic matrix
mathematical programming (16). This conclusion is supported in several regards, for
example: potentially, there is an infinite number of possible definitions of the value function
$f(\cdot)$; furthermore, the value function approach is not the only way to restate (16); and there exist
many ways to solve (16) from a stochastic point of view. We believe that this responsibility
lies with the person skilled in the particular field and in his/her capacity to discern which
function or approach better reflects and meets the objectives of the study.
Table 2: Sample sizes and estimator of variances for the different allocations calculated

Allocation^a        n_1   n_2   n_3   n_4   n_5   n_6   n_7   n_8   n_9   Var(y^1_ST)   Var(y^2_ST)
BA                   10    94   144   136   191   113    81   109   122       5.591      5441.105
Vol                   7    62   119   136   200   161    98   134    83       5.953      5139.531
Modified E-model      8    46    77   119   191   191   158   161    49       7.312      5593.494
E-model^b             7    63   119   135   200   160    98   134    84       5.937      5139.645
V-model               8    46    77   119   191   191   158   161    49       7.312      5593.494
P-model^c           632     9   117    29    46    54    52    49     7      29.746     20820.660

^a The estimated fourth moments $m_{hj}^4$ were simulated.
^b Where $k_1 = k_2 = 0.5$.
^c Where $\tau = 6000$.

In this paper, the problem of optimal allocation in multivariate stratified sampling was 
considered. In all sample size problems there is always uncertainty regarding the population 
parameters and in this work, this uncertainty was incorporated via a stochastic matrix 
mathematical solution. 